COMPRESSING XML DOCUMENTS INTO VALID XML DOCUMENTS

TECHNICAL FIELD

This document relates generally to compression algorithms for data files 
and in particular to compressing extensible markup language (XML) documents. 

BACKGROUND 

The extensible markup language (XML) is a language that is written in
the standard generalized markup language (SGML). SGML is an international
standard meta-language for text markup applications (ISO 8879). XML is a
human-readable, text-based language, making it easy to use. Partly because
XML is written in an international standard and partly because of its ease of use,
XML is widely used in a variety of applications. Another advantage is that
XML files or documents explicitly flag the type of data contained in the
documents by enclosing blocks of data with labels to declare the type of XML
elements contained in a block. This makes XML documents data-type aware.

However, because it is human-readable and because it is data-type aware,
XML can be a verbose language. Human-readable data files are larger compared
to other formats (such as binary formats for example) and the data-type
declarations expand the size of data files. Large XML files may cause problems
in systems that are memory constrained or in communication systems having
channels that are bandwidth limited.


SUMMARY 

This document describes both devices and methods used to manage
extensible markup language (XML) files or documents. One method example
comprises compressing a first XML document into a binary stream, converting
the binary stream into a compressed valid XML document, and associating at
least one XML tag with the compressed valid XML document in order to
identify the document as a compressed XML document.

One device example includes at least one processor, a network interface
to communicate with the at least one processor and a network, and an XML
document processing module. The XML document processing module includes
a compression module to compress XML documents into compressed valid
XML documents.

BRIEF DESCRIPTION OF THE DRAWINGS 
FIG. 1 shows a block diagram of one embodiment of a method of
managing XML documents.

FIG. 2 shows a block diagram of another embodiment of a method of
managing XML documents.

FIG. 3 is a block diagram illustrating portions of a network device operable
to manage XML documents.

FIG. 4 is a block diagram illustrating portions of another embodiment of a
network device operable to manage XML documents.

FIG. 5 is a block diagram of portions of an embodiment of a system for
managing XML documents.

FIG. 6 is an embodiment using an XML tag at the beginning and end of
the file.

FIG. 7 is an original XML document configuration file.

FIG. 8 is the compressed version of the document.

DETAILED DESCRIPTION 
In the following detailed description, reference is made to the
accompanying drawings which form a part hereof, and in which is shown by
way of illustration specific embodiments in which the invention may be
practiced. It is to be understood that other embodiments may be utilized and
structural changes may be made without departing from the scope of the present
invention.

This document discusses, among other things, methods and devices for
managing extensible markup language (XML) files or documents. Because
XML is widely used, many applications that use XML would benefit from
reducing the size of XML documents. This is especially true where the
applications are memory constrained such as in embedded systems. Application
files that are reduced in size would allow the files to be stored using less
memory. Applications that include bandwidth limited communication systems
would also benefit from reducing the size of XML files. These applications
include those that are slow, such as a slow serial line, or those that experience a
large amount of communication traffic such as a wide area network (WAN).
These applications would benefit from minimizing traffic by minimizing the
amount of data transferred.

To manage the size of large XML documents, the documents are
compressed. In contrast to typical compression methods however, documents
compressed under the methods of the present application remain valid XML
documents. A valid XML document is a document that is well formed and has
an associated document-type declaration. This allows the compressed valid
XML document to be recognized and accessed by applications that process XML
documents.

FIG. 1 shows a block diagram 100 of one embodiment of a method of
managing XML documents. The method includes reducing the size of the
document by compressing it. At 110, an XML document is compressed into a
binary stream. Because XML documents are plain text, they are very redundant.
Any compression method that results in good compression ratios on redundant
text streams may be used. A 70% compression ratio is a typical good
compression ratio. In one embodiment, the compression method is a deflate
compression algorithm, such as RFC 1951 for example.
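
The compression at 110 can be sketched as follows. This is a minimal illustration, assuming Python's standard zlib module as the deflate implementation; the function name deflate and the sample document are hypothetical and do not come from the application.

```python
import zlib


def deflate(xml_text: str) -> bytes:
    """Compress XML text into a raw deflate (RFC 1951) binary stream."""
    # wbits=-15 requests a raw deflate stream without a zlib or gzip header.
    compressor = zlib.compressobj(level=9, wbits=-15)
    return compressor.compress(xml_text.encode("utf-8")) + compressor.flush()


# Hypothetical, highly repetitive sample document (not from the application).
sample = (
    "<settings>"
    + "<port><baud>9600</baud><parity>none</parity></port>" * 200
    + "</settings>"
)
stream = deflate(sample)
print(len(sample.encode("utf-8")), "bytes ->", len(stream), "bytes")
```

The ratio actually obtained depends on how redundant the input is; the 70% figure in the text is a typical value, not a guarantee.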

At 120, the binary stream is converted into a compressed valid XML
document. To accomplish this, the binary stream is expanded back into text.
This is necessary because valid XML documents cannot contain binary data.
Preferably, the binary stream is expanded using base-64 encoding, but any
encoding mechanism that has characteristics similar to base-64 encoding may be
used. A mechanism with similar characteristics refers to an encoding
mechanism that takes binary bytes of data and converts them into printable
characters in the American Standard Code for Information Interchange (ASCII)
standard. For example, UUencode has similar characteristics.
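
The expansion at 120 can be illustrated with a short sketch, assuming Python's standard base64 module. The function name to_text and the sample bytes are hypothetical.

```python
import base64


def to_text(binary_stream: bytes) -> str:
    """Expand a binary stream into printable ASCII using base-64."""
    return base64.b64encode(binary_stream).decode("ascii")


# Hypothetical binary input: every byte value, repeated three times.
raw = bytes(range(256)) * 3
text = to_text(raw)
print(len(raw), "bytes ->", len(text), "characters")  # 768 bytes -> 1024 characters
```

Every three input bytes become four output characters, which is the source of the roughly 33% expansion.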
If base-64 encoding is not used, the resulting text must be searched
through after encoding. Any characters that are present in the text that would
cause the resulting document to be invalid XML must be converted. For
example, use of the '<' or '>' characters would result in an invalid XML
document, and the characters must be replaced with the standard XML
replacement text '&lt;' and '&gt;', respectively. Base-64 does not require this
replacement step because the resulting characters only consist of upper and
lower case A-Z, numerals 0-9, '+', '/', and '=', which are valid XML characters.
The result of using binary to ASCII encoding is an expansion of the binary
stream back into a text file with an expansion ratio of about 33%.
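
The replacement step for encodings other than base-64 can be sketched as follows. Ascii85 is used here only as a stand-in for such an encoding, because its output alphabet includes '<', '>', and '&'; the application itself names UUencode. The function names are hypothetical, and the escaping relies on Python's standard xml.sax.saxutils helpers.

```python
import base64
from xml.sax.saxutils import escape, unescape


def encode_for_xml(binary_stream: bytes) -> str:
    """Encode with Ascii85 (assumed stand-in), then replace XML-special characters."""
    text = base64.a85encode(binary_stream).decode("ascii")
    # escape() replaces '&', '<' and '>' with '&amp;', '&lt;' and '&gt;'.
    return escape(text)


def decode_from_xml(xml_text: str) -> bytes:
    """Undo the replacement text, then reverse the Ascii85 encoding."""
    return base64.a85decode(unescape(xml_text).encode("ascii"))


payload = bytes(range(256)) * 4  # hypothetical binary stream
safe_text = encode_for_xml(payload)
print("<" in safe_text, ">" in safe_text)  # False False
```

Note that '&' must be replaced as well as '<' and '>', since an unescaped ampersand also makes a document ill-formed.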

The net result of the compression and expansion is approximately a 2.5 to
1 compression from the original XML document: a 70% reduction leaves about
30% of the original size, and the 33% expansion from encoding raises this to
about 40%. At 130, at least one XML tag
is associated with the compressed valid XML document in order to identify the
document as a compressed XML document. An example of an embodiment
using an XML tag at the beginning and end of the file is shown in FIG. 6.
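
The tagging at 130 can be sketched as follows, assuming Python's standard xml.etree.ElementTree module. The tag name compressed_document and the format attribute value are hypothetical placeholders, not names taken from the application. The sketch produces a well-formed document only; it does not emit a document-type declaration.

```python
import base64
import xml.etree.ElementTree as ET
import zlib

TAG = "compressed_document"  # hypothetical identifying tag name


def wrap(xml_text: str) -> str:
    """Deflate, base-64 encode, and enclose the result in an identifying tag."""
    compressor = zlib.compressobj(wbits=-15)  # raw deflate stream
    stream = compressor.compress(xml_text.encode("utf-8")) + compressor.flush()
    root = ET.Element(TAG, {"format": "deflate+base64"})  # hypothetical attribute
    root.text = base64.b64encode(stream).decode("ascii")
    return ET.tostring(root, encoding="unicode")


wrapped = wrap("<boot><dhcp>off</dhcp></boot>")
print(wrapped)
```

Because the payload contains only base-64 characters, no replacement text is needed inside the tag.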

In a specific example, the XML document is a configuration file used to
configure a remote device. A description of using XML documents to configure
remote devices is included in co-pending U.S. Patent Application 10/873,051,
entitled "DEVICE SERVER ACCESS USING A DATA TYPE AWARE
MARK-UP LANGUAGE," which is incorporated herein by reference. An
example original XML document configuration file is shown in FIG. 7. The
compressed version of the document of FIG. 7 is shown in FIG. 8.

The compressed document looks random because of compression and
expansion with encoding. However, the compressed document is a valid XML
document.

Note that due to space limitations, the original XML document is small,
about 440 bytes. This results in the compressed document in the example being
larger than the original XML document. Because of the way compression
algorithms work, the algorithms only compress well when the original file is
large, e.g., greater than 4000 bytes. For smaller documents, the amount of
memory used or the amount of time spent in sending the document is not an
issue and the method to manage the XML documents may not be needed.
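
One way to act on this observation is to skip compression for small documents. The sketch below is an assumption, not part of the described method: the 4000-byte threshold is simply the example figure from the text, and the function name compress_if_worthwhile is hypothetical.

```python
import base64
import zlib

MIN_BYTES = 4000  # example threshold from the text, not a tuned value


def compress_if_worthwhile(xml_text: str) -> tuple[bool, str]:
    """Return (compressed?, text); leave small or incompressible documents alone."""
    raw = xml_text.encode("utf-8")
    if len(raw) <= MIN_BYTES:
        return False, xml_text
    compressor = zlib.compressobj(wbits=-15)  # raw deflate stream
    stream = compressor.compress(raw) + compressor.flush()
    encoded = base64.b64encode(stream).decode("ascii")
    if len(encoded) >= len(raw):
        # Encoding overhead outweighed the compression gain.
        return False, xml_text
    return True, encoded


small = "<a>1</a>"
big = "<r>" + "<item><name>x</name></item>" * 500 + "</r>"
print(compress_if_worthwhile(small)[0], compress_if_worthwhile(big)[0])  # False True
```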
Because the compressed document is valid XML, any application that
can read XML documents will recognize the document and can access its
contents. To use the document, the application must decompress the document
to return the document to its original format. This involves reversing the
compression and encoding process. Therefore, a further embodiment of the
method 100 for managing XML documents includes reconverting the
compressed valid XML document into a binary stream, and decompressing the
binary stream to obtain the first XML document. In one embodiment,
reconverting into binary includes reverse base-64 encoding, and decompressing
includes running a reverse deflate algorithm on the interim binary stream. If a
mechanism other than base-64 encoding is used, the replaced characters must be
reconverted from their XML replacements.
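
The reverse path can be sketched as follows, assuming base-64 encoding and a raw deflate stream produced with Python's standard modules. The helper names pack and restore are hypothetical.

```python
import base64
import zlib


def pack(xml_text: str) -> str:
    """Forward path: deflate, then base-64 encode."""
    compressor = zlib.compressobj(wbits=-15)  # raw deflate stream
    stream = compressor.compress(xml_text.encode("utf-8")) + compressor.flush()
    return base64.b64encode(stream).decode("ascii")


def restore(encoded_text: str) -> str:
    """Reverse path: base-64 decode to the interim binary stream, then inflate."""
    interim_stream = base64.b64decode(encoded_text)
    return zlib.decompress(interim_stream, wbits=-15).decode("utf-8")


original = "<serial><baud>9600</baud><parity>none</parity></serial>"
print(restore(pack(original)) == original)  # True
```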

FIG. 2 shows a block diagram of an embodiment of a method 200 that
includes compressing and encoding a first XML document and then transferring
a compressed valid XML document over a network. The first XML document is
any generic document, such as a status message for example. At 210, a first
XML document is compressed into a binary stream. At 220, the binary stream is
converted into a compressed valid XML document. The compressing and
encoding are accomplished by any of the methods discussed previously. At 230,
at least one XML tag is associated with the compressed valid XML document
that identifies the document as a compressed XML document. At 240, the
compressed valid XML document is transferred over a network to a receiving
device. At 250, the transferred document is recognized as a compressed valid
XML document. According to some embodiments, a master device such as a
master processor generates the compressed valid XML document and initiates
the transfer to a remote device that recognizes the XML document from the at
least one XML identifying tag. In an example of one such embodiment, the first
XML document includes a configuration file. In another example, the first XML
document includes a status message. In yet another example, the first XML
document includes a command message. According to other embodiments, a
remote device generates the compressed valid XML document and initiates the
transfer. At 260, the compressed valid XML document is reconverted into a
binary stream. At 270, the binary stream is decompressed to obtain the first
XML document. A receiving device is then able to process the XML document.
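
The sending steps 210 to 230 and the receiving steps 250 to 270 can be sketched together. This is an illustration under stated assumptions: the tag name compressed_document is hypothetical, the transfer at 240 is represented by simply passing a string, and a document without the identifying tag is returned unchanged.

```python
import base64
import xml.etree.ElementTree as ET
import zlib

TAG = "compressed_document"  # hypothetical identifying tag name


def prepare(xml_text: str) -> str:
    """Steps 210-230: compress, encode, and tag the first XML document."""
    compressor = zlib.compressobj(wbits=-15)  # raw deflate stream
    stream = compressor.compress(xml_text.encode("utf-8")) + compressor.flush()
    root = ET.Element(TAG)
    root.text = base64.b64encode(stream).decode("ascii")
    return ET.tostring(root, encoding="unicode")


def handle(received: str) -> str:
    """Steps 250-270: recognize the tag, reconvert to binary, and decompress."""
    root = ET.fromstring(received)
    if root.tag != TAG:
        return received  # not a compressed document; use it as received
    stream = base64.b64decode(root.text or "")
    return zlib.decompress(stream, wbits=-15).decode("utf-8")


status = "<status><uptime>42</uptime></status>"
print(handle(prepare(status)) == status)  # True
```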

In yet another embodiment, transferring the compressed valid XML
document over a network includes transferring the compressed valid XML
document over a serial communications network, such as a network that uses an
RS232 protocol or an Ethernet network. In yet another embodiment, transferring
the compressed XML document over a network includes transferring the
compressed valid XML document over a wireless network, such as a wireless
local area network (WLAN) or a mobile phone network. In a further
embodiment, transferring the compressed XML document over a network
includes transferring the compressed valid XML document over the internet. In
other embodiments, one or a combination of the several method embodiments
are provided on a computer readable medium such as a diskette or CD ROM.

FIG. 3 is a block diagram illustrating portions of a network device 300
operable to manage XML documents. The network device 300 includes at least
one processor 310, a network interface 320 to communicate with the at least one
processor 310 and a network 330, and an XML document processing module
340.

The XML document processing module 340 includes a compression
module 350 to generate compressed valid XML documents. In one embodiment,
the XML document processing module 340 includes a deflate compression
algorithm. In another embodiment, the XML document processing module 340
includes a binary to ASCII text encoding algorithm. In one such embodiment,
the binary to ASCII text encoding algorithm includes a base-64 encoding
algorithm.

In some embodiments, the network device 300 is an embedded device
server operable to manage a remote device using XML documents. In another
embodiment, the network interface 320 includes a serial port. In another
embodiment, the network interface 320 includes a web interface. In another
embodiment, the network 330 is a wireless network. In one such embodiment,
the network device 300 is included in a cell phone. In another embodiment, the
network 330 is a wireless local area network (WLAN) and the network device
300 is included in a WLAN computer card.
FIG. 4 is a block diagram of portions of an embodiment of a network
device 400 operable to recognize a compressed valid XML document and
reverse the compression process. The network device 400 includes at least one
processor 410, a network interface 420 to communicate with the at least one
processor 410 and a network 430, and an XML document processing module
440 that includes a compression module 450. To reverse the compression
process, the network device 400 includes a decompression module 460 to
decompress compressed valid XML documents. In one embodiment, the
decompression module 460 includes a re-conversion algorithm to reconvert a
compressed valid XML document into a binary stream and a reverse deflate
algorithm to convert the interim binary stream into an XML document.

FIG. 5 is a block diagram of portions of an embodiment of a system 500
for communicating XML documents. The system 500 includes a
communication network 530 and at least first and second network devices
505A-B to communicate over the network 530. Each network device 505A-B includes
at least one processor 510A-B, a network interface 520A-B to communicate with
the at least one processor 510A-B and the network 530, and an XML document
processing module 540A-B. An XML document processing module 540A-B
includes a compression module 550A-B to compress XML documents into
compressed valid XML documents and a decompression module 560A-B to
decompress compressed valid XML documents.
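
The arrangement of FIG. 5 can be modeled loosely in software. In the sketch below, the class names are hypothetical, an in-memory queue stands in for the network 530, and the identifying tag is omitted for brevity; none of these choices comes from the application.

```python
import base64
import zlib
from collections import deque


class XmlProcessingModule:
    """Stand-in for a processing module with compression and decompression parts."""

    @staticmethod
    def compress(xml_text: str) -> str:
        compressor = zlib.compressobj(wbits=-15)  # raw deflate stream
        stream = compressor.compress(xml_text.encode("utf-8")) + compressor.flush()
        return base64.b64encode(stream).decode("ascii")

    @staticmethod
    def decompress(encoded: str) -> str:
        return zlib.decompress(base64.b64decode(encoded), wbits=-15).decode("utf-8")


class NetworkDevice:
    """Stand-in for a network device that shares a network with its peer."""

    def __init__(self, network: deque):
        self.network = network  # in-memory queue standing in for the network
        self.module = XmlProcessingModule()

    def send(self, xml_text: str) -> None:
        self.network.append(self.module.compress(xml_text))

    def receive(self) -> str:
        return self.module.decompress(self.network.popleft())


network = deque()
device_a, device_b = NetworkDevice(network), NetworkDevice(network)
device_a.send("<status><state>ok</state></status>")
print(device_b.receive())  # <status><state>ok</state></status>
```

Because each device carries both modules, either one can act as sender or receiver.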

In one embodiment, a first network device 505A is operable to transfer a
status message over the network 530 as a compressed valid XML document to a
second network device 505B. In another embodiment, a first network device
505A is an embedded device server operable to receive a device configuration
file as a compressed valid XML document over the network 530 and decompress
the document. According to other embodiments, the network 530 is a serial
communication network. In other embodiments, the network 530 is a wireless
communication network.

The accompanying drawings that form a part hereof, show by way of
illustration, and not of limitation, specific embodiments in which the subject
matter may be practiced. The embodiments illustrated are described in sufficient
detail to enable those skilled in the art to practice the teachings disclosed herein.
Other embodiments may be utilized and derived therefrom, such that structural
and logical substitutions and changes may be made without departing from the
scope of this disclosure. This Detailed Description, therefore, is not to be taken
in a limiting sense, and the scope of various embodiments is defined only by the
appended claims, along with the full range of equivalents to which such claims
are entitled.

Such embodiments of the inventive subject matter may be referred to
herein, individually and/or collectively, by the term "invention" merely for
convenience and without intending to voluntarily limit the scope of this
application to any single invention or inventive concept if more than one is in
fact disclosed. Thus, although specific embodiments have been illustrated and
described herein, it should be appreciated that any arrangement calculated to
achieve the same purpose may be substituted for the specific embodiments
shown. This disclosure is intended to cover any and all adaptations or variations
of various embodiments. Combinations of the above embodiments, and other
embodiments not specifically described herein, will be apparent to those of skill
in the art upon reviewing the above description.

The Abstract of the Disclosure is provided to comply with 37 C.F.R.
§1.72(b), requiring an abstract that will allow the reader to quickly ascertain the
nature of the technical disclosure. It is submitted with the understanding that it
will not be used to interpret or limit the scope or meaning of the claims. In
addition, in the foregoing Detailed Description, it can be seen that various
features are grouped together in a single embodiment for the purpose of
streamlining the disclosure. This method of disclosure is not to be interpreted as
reflecting an intention that the claimed embodiments require more features than
are expressly recited in each claim. Rather, as the following claims reflect,
inventive subject matter lies in less than all features of a single disclosed
embodiment. Thus the following claims are hereby incorporated into the
Detailed Description, with each claim standing on its own as a separate
embodiment.

CLAIMS

1. A method of compressing XML documents, the method comprising:
compressing a first XML document into a binary stream;
converting the binary stream into a compressed valid XML document; and
associating at least one XML tag with the compressed valid XML
document, wherein the XML tag identifies the document as a compressed XML
document.

2. The method of claim 1, wherein compressing the first XML document
into a binary stream includes compressing the XML document using a deflate
compression algorithm.

3. The method of claim 1, wherein converting the binary stream into the
compressed valid XML document includes converting the binary stream to
ASCII text using base-64 encoding.

4. The method of claim 1, wherein converting the binary stream into the
compressed valid XML document includes replacing invalid XML characters
with standard XML replacement text.

5. The method of claim 1, wherein the first XML document includes a
configuration file to configure a remote device.

6. A method of transferring XML documents, the method further
comprising:
compressing a first XML document into a binary stream;
converting the binary stream into a compressed valid XML document;
transferring the compressed valid XML document over a network;
reconverting the compressed valid XML document into a binary stream; and
decompressing the binary stream to obtain the first XML document.



WO 2006/017804

PCT/US2005/028051



7. The method of claim 6, wherein converting the binary stream into a
compressed valid XML document includes associating at least one XML tag
with the compressed valid XML document, wherein the XML tag identifies the
document as a compressed XML document.

8. The method of claim 6, wherein reconverting the compressed valid XML
document into a binary stream includes reconverting standard XML replacement
text back to original characters.

9. The method of claim 6, wherein transferring the compressed valid XML
document over a network includes transferring the compressed valid XML
document over a serial communications network.

10. The method of claim 6, wherein transferring the compressed XML
document over a network includes transferring the compressed valid XML
document over a wireless network.

11. The method of claim 6, wherein transferring the compressed XML
document over a network includes transferring the compressed valid XML
document over the internet.

12. A computer readable medium to implement a method of compressing
XML documents, the computer readable medium comprising program code for:
compressing an XML document into a binary stream; converting the binary
stream into a compressed valid XML document; and associating at least one
XML tag with the compressed valid XML document, wherein the XML tag
identifies the document as a compressed XML document.

13. The computer readable medium of claim 12, wherein the computer
readable medium further includes program code for: reconverting the
compressed valid XML document into a binary stream; and decompressing the
binary stream into an XML document.

14. The computer readable medium of claim 12, wherein the program code
includes a deflate compression algorithm.

15. The computer readable medium of claim 12, wherein the program code
includes a binary to ASCII text encoding algorithm.

16. A network device comprising:
at least one processor;
a network interface to communicate with the at least one processor and a
network; and
an XML document processing module, including a compression module
to compress XML documents into compressed valid XML documents.

17. The network device of claim 16, wherein the XML document processing
module includes a deflate compression algorithm.

18. The network device of claim 17, wherein the XML document processing
module includes a binary to ASCII text encoding algorithm.

19. The network device of claim 18, wherein the binary to ASCII text
encoding algorithm includes a base-64 encoding algorithm.

20. The network device of claim 16, wherein the XML document processing
module includes a decompression module to decompress compressed valid XML
documents.

21. The network device of claim 16, wherein the network device is an
embedded device server operable to manage a remote device using XML
documents.

22. The network device of claim 16, wherein the network interface includes a
serial port.

23. The network device of claim 16, wherein the network interface includes a
web interface.

24. The network device of claim 16, wherein the network is a wireless
network.

25. The network device of claim 24, wherein the network device is included
in a cell phone.

26. The network device of claim 24, wherein the network is a wireless local
area network (WLAN) and the network device is included in a WLAN computer
card.

27. A method for transmitting XML documents, the method comprising:
compressing a first XML document into a binary stream;
converting the binary stream into a compressed valid XML document;
associating at least one XML tag with the compressed valid XML
document, wherein the XML tag identifies the document as a compressed XML
document;
transferring the compressed valid XML document over a network;
recognizing the transferred document as a compressed valid XML
document;
reconverting the compressed valid XML document into a binary stream; and
decompressing the binary stream to obtain the first XML document.

28. The method of claim 27, wherein reconverting the compressed valid
XML document into a binary stream includes reverse base-64 encoding.

29. The method of claim 28, wherein decompressing the binary stream
includes running a reverse deflate algorithm.

30. The method of claim 29, wherein transferring over a network includes
transferring over the internet.

31. A system for communicating XML documents, the system comprising:
a communication network; and
at least first and second network devices to communicate over the
network, wherein each network device includes:
    at least one processor;
    a network interface to communicate with the at least one
    processor and the network; and
    an XML document processing module, wherein the XML
    document processing module includes:
        a compression module to compress XML documents into
        compressed valid XML documents; and
        a decompression module to decompress compressed valid
        XML documents.

32. The system of claim 31, wherein the first network device is an embedded
device server, the first network device operable to receive a device configuration
file as a compressed valid XML document and decompress the document.

33. The system of claim 31, wherein the first network device is operable to
transfer a status message as a compressed valid XML document to the second
network device.

34. The system of claim 31, wherein the network is a serial communication
network.

35. The system of claim 31, wherein the network is a wireless
communication network.

<document type, format="compression format">

(compressed document inserted here)
</document type>

<rci_reply version="1.0">
  <query_setting>
    <boot>
      <dhcp>off</dhcp>
      <ip>10.20.1.4</ip>
      <subnet>255.255.255.0</subnet>
      <gateway>10.20.1.1</gateway>
    </boot>
    <serial>
      <baud>9600</baud>
      <databits>8</databits>
      <stopbits>1</stopbits>
      <parity>none</parity>
      <flowcontrol>software</flowcontrol>
      <desc></desc>
    </serial>
  </query_setting>
</rci_reply>

UEsDBBQAAAAIAHF^BCnqXlwAAALkBAAMAAAAb3JpZyS4bWxdV^

K\vyAQ/S/sHXICmGWXZjdqxSTr>pCcKxOOnKPHrg2JC^^ 

Tnirl6UkyO/frVrViiaFX6eqgsincb4niEoH+vkliZbJ^^ 

VkOD2hlCftWiv9Crbiis9q2Ab2qletDhSY7^ol6E]aJea^ 

A]ia>VmzrJOF3(ndi3ZtwWjrcniAS])Ty6mdGaEJ^ 

C^qTbOodSvafYxvm^IADa/wSPuviTZp/lBraLAQIV^ 

lwAAAU£BAAAIAAAAAAAAAA£AIAC2gQAAAABraDloIjibtbFBUQYAAA^ 

AAQABADYAAAI>9AAAAAAA== 

</rci_reply>



